assigned to the target is converted into a standardized average
rating score for the target (SAR score). 


The distribution of the sum of ratings for the controls can
be considered as the distribution of ratings associated with that
condition. Reduced to the level of individual trials, we assume this
distribution to be typical for the condition and express all ratings
in this distribution of average ratings. Thus, all ratings are
converted into standard normal scores by computing each rating's
distance from the average rating for the controls of the trials and
dividing it by the standard deviation observed for these average
ratings.


Then for each trial a SAR score for the target is defined as
the difference between this standard normal score for the target
and the average standard normal score for target and controls.
Since the SAR scores are based on true standard normal scores,
which means scores obtained from a normal distribution, SAR scores
can be considered normal too. For each trial the sum of SAR scores
for controls and targets is zero. Therefore, in the case of related
samples we might compare individual achievement over conditions by
calculating a product-moment correlation between the SAR scores of
the two conditions.
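

As an illustration only, the scoring just described might be sketched
in Python as below. The passage leaves the reference distribution
partly open, so this sketch assumes it is the set of per-trial average
control ratings pooled over all trials of a condition. The function
names and the convention that the target occupies position 0 are
likewise assumptions and not part of the original procedure.

```python
from statistics import mean, stdev


def sar_scores(ratings, target_index=0):
    """Return one list of deviation scores per trial.

    ratings: list of trials, each a list of ratings (target + controls).
    Assumption: ratings are standardized against the per-trial average
    control ratings, pooled over all trials of the condition.
    """
    control_means = [
        mean(r for i, r in enumerate(trial) if i != target_index)
        for trial in ratings
    ]
    centre = mean(control_means)
    spread = stdev(control_means)  # needs >= 2 trials and a nonzero spread
    result = []
    for trial in ratings:
        z = [(r - centre) / spread for r in trial]
        z_bar = mean(z)  # average standard score for target and controls
        result.append([value - z_bar for value in z])
    return result


def target_sar(ratings, target_index=0):
    """Extract the target's score from every trial."""
    return [trial[target_index] for trial in sar_scores(ratings, target_index)]
```

Because each trial's scores are taken as deviations from that trial's
own mean standard score, they sum to zero within every trial, whatever
reference distribution is assumed.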


Although the randomization test described above seems
statistically sound we further studied its properties, especially
regarding its sensitivity to detect ESP. To this end we conducted a
computer simulation of 100 "experiments" for each combination of two
variables. Each experiment consisted of 20 trials and 5 pictures
per trial and was simulated by randomly generating 20 rows of 5
numbers between rating values 0 and 30 inclusive. The two
variables involved were subjects' rating behavior and amount of ESP.
For rating behavior we manipulated the probability of selecting
rating values of zero. The amount of ESP was operationalized as the
number of subjects assigning the highest rating value to the target
in addition to what could be expected by chance.
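

A minimal sketch of how one such simulated experiment could be
generated is given below. It is not the original simulation program.
The way zero ratings are injected (an independent probability per
rating), the way ESP is injected (moving the row maximum to the
target in the first few trials), and all names are assumptions made
for illustration.

```python
import random


def simulate_experiment(n_esp_trials, p_zero, n_trials=20, n_pictures=5,
                        max_rating=30, seed=None):
    """Generate one simulated experiment as a list of rating rows.

    Column 0 is treated as the target (an assumption of this sketch).
    p_zero: probability that any single rating is forced to zero.
    n_esp_trials: number of trials in which the target receives the
    highest rating of its row, on top of whatever chance produces.
    """
    rng = random.Random(seed)
    rows = []
    for trial in range(n_trials):
        row = [0 if rng.random() < p_zero else rng.randint(0, max_rating)
               for _ in range(n_pictures)]
        if trial < n_esp_trials:
            best = row.index(max(row))
            row[0], row[best] = row[best], row[0]  # target gets the top rating
        rows.append(row)
    return rows


def count_hits(rows):
    """Count trials in which the target holds the unique highest rating."""
    return sum(1 for row in rows if row[0] > max(row[1:]))
```

Repeating the call 100 times for each combination of `p_zero` and
`n_esp_trials` would give a grid of simulated experiments of the kind
described.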


From the data obtained it can be concluded that in most
conditions the sensitivity of the SSR scores is rather low and less
than that when, for instance, a simple binomial test was applied. Only
in extreme cases of rating behavior and amount of ESP do the SSR
scores become more sensitive than the binomial test. For instance,
in the case of 5 ESP hits, when in total 5 + 15/5 = 8 hits can be
expected, the binomial yields an exact one-tailed probability of
p = .01 whereas the SSR score yields on average a Z of 1.7 with an
associated one-tailed probability of .045.
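

The quantities in this comparison can be computed with a few lines of
Python. The helper names below are illustrative, and the chance hit
rate of 1/5 is assumed from the five pictures per trial. The sketch
only shows the standard formulas for an expected hit count, an exact
binomial upper tail, and a one-tailed normal probability.

```python
from math import comb, erfc, sqrt


def expected_hits(n_esp, n_trials=20, n_pictures=5):
    """ESP hits plus the chance hits expected in the remaining trials."""
    return n_esp + (n_trials - n_esp) / n_pictures


def binomial_upper_tail(k, n, p):
    """Exact probability of k or more hits in n trials with hit rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def normal_upper_tail(z):
    """One-tailed probability of a standard normal score of z or more."""
    return 0.5 * erfc(z / sqrt(2))
```

For example, `expected_hits(5)` reproduces the 5 + 15/5 hit count, and
`normal_upper_tail(1.7)` converts an average Z into its one-tailed
probability.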


In the same simulation studies Stanford Z-scores were
computed. We know that the distributions for these Z-scores are
non-normal but leaving this aside we found that in most cases the
sensitivity of t-test evaluations based on Stanford Z-scores is
comparable to that of evaluations based on SSR scores. However, SSR
scores appear more sensitive than Stanford Z-scores in cases of
strong ESP and extreme rating behavior.


From these findings practical conclusions can be drawn.
In general we must assume the ESP influence on the data is
relatively little. Hence, unless there is reason to expect a strong
ESP influence in the experiment the binomial test can be assumed to
be more sensitive than an evaluation based on the rating values.
The same applies for experiments in which no extreme rating
behavior can be expected, for instance, an experiment in which
an atomistic approach to the judging is followed. In that case we
expect in general nonzero ratings assigned to all pictures, and our
findings show that in that case the SSR scores, as well as
Stanford's Z-scores, are rather insensitive.


A METHODOLOGY FOR THE DEVELOPMENT OF A
KNOWLEDGE-BASED JUDGING SYSTEM FOR FREE-RESPONSE
MATERIALS


Dick J. Bierman (Dept. of Psychology, University of Amsterdam) 


It has been found that certain judges perform consistently 
better than others when matching targets to a target set. It seems 
unlikely that this is purely because of the judge's psi, since psi 
generally does not display consistent behavior. Therefore, it might 
be hypothesized that it is the (intuitive) knowledge of the specific 
judge that accounts for his better performance on this task. It 
has been proposed (Morris, EJP, 1986, 137-149) that the use of 
expert systems might help psi researchers in tasks where they lack 
expertise, such as in the detection of fraud. Morris argues that 
the expertise of magicians could be formalized in such a system and 
made available to each individual researcher. Similarly, the
expertise of the best judges of free-response material could become
available through implementation of a knowledge-based free-response
judging system. This use of techniques from the field of artificial 
intelligence (AI) to represent scarce knowledge should not be con- 
fused with the use of AI techniques for the representation of free- 
response material (Maren, RIP 1986, 97-99). According to Maren, 
the free-response material and the protocols should be represented 
in the form of trees in which the nodes are perceivable "objects," 
like "flames," and the links represent relations, like "adjacent to."
We expect that focusing our attention on the (knowledge used in
the) human matching process might reveal more fundamental
information about the role of the meaning of the material. It is
striking that in Maren's proposed representation of complex target material
only visual features are present. Actually, the type of visual 
matching that Maren proposes to be done by a machine can be bet- 
ter performed by any sighted human. 


It should be remarked that the crucial element in the
development of expert systems nowadays is not the implementation of the
system but the elicitation of the knowledge that has to be entered 
into the system. In the case of knowledge about trickery, for in- 
stance, it is doubtful that one can find experts who are willing to 
transfer their knowledge. Apart from that, the detection of trick- 
ery is largely driven by visual information. The proper represen- 
tation of this visual knowledge might also be a major problem in this 
domain of expertise. In the case of free-response judging one can 
expect cooperation from the expert judges. Although the material 
is also visual there are strong indications that simple key words 
are able to represent these pictures satisfactorily. This conclusion 
can be drawn from the analytical judging procedures developed by 
Jahn et al. (Jahn et al., JP, 1980, 207-231). 


Analytical judging versus knowledge-based judging. It has 
been found that simple (linear) regression formulas make predictions 
comparable to or better than human experts in the domain of
psychodiagnostics. Thus, it is not surprising that the analytical
judging procedure, which is very similar to a linear regression
approach, also yields satisfactory results. However, it should be
noted that although its average performance is adequate, this
approach fails in
pathological cases. It appears that this is because of the failure to 
take into account any interaction between the predictor variables. 
In the analytical judging procedure, for instance, the simultaneous 
occurrence of two elements is counted as the sum of the scores for 
the cases when they occur alone. Thus, if two elements together 
have a symbolic meaning that is not contained in either element 
separately, this meaning is missed in the analytical judging pro- 
cedure. A knowledge-based judging system is capable of repre- 
senting and using this type of knowledge. 
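

The difference can be made concrete with a small sketch. The element
names, weights, and the single combination rule used in the usage
note are hypothetical and only illustrate the point about
interactions; they are not taken from any existing judging procedure.

```python
def additive_score(elements, weights):
    """Analytical-style score: every element contributes independently."""
    return sum(weights.get(element, 0.0) for element in set(elements))


def knowledge_based_score(elements, weights, combination_rules):
    """Additive score plus a bonus for each rule whose elements all occur.

    combination_rules maps a frozenset of elements to the extra score
    carried by their joint (for instance symbolic) meaning.
    """
    present = set(elements)
    bonus = sum(value for combo, value in combination_rules.items()
                if combo <= present)
    return additive_score(present, weights) + bonus
```

With hypothetical weights `{"water": 1.0, "bird": 1.0}` and a rule
`{frozenset({"water", "bird"}): 2.0}`, the additive score of a protocol
containing both elements is simply the sum of the two weights, whereas
the knowledge-based score also credits the joint meaning.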


Matching as classification task. Most problem-solving tasks 
can be seen as classification tasks. In the case of matching
free-response material from psi experiments, however, there is a special
problem. Since the categories "correct match" and "incorrect 
match" in psi research are determined by chance, these categories 
do not have objective attributes. Thus, the task cannot be modeled 
as a direct classification task. Therefore, we propose to model the 
matching process as a double classification process. The judge is 
thought to begin with a classification of the protocol in one of his 
internalized categories. Secondly, this procedure is repeated for 
each of the members of the target set. Finally, the results of 
these classifications are evaluated using overlap measures. If no 
clear-cut match can be made a secondary evaluation is done which 
takes into account (subtle) interactions among attributes. 
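

A rough sketch of this double classification is given below. The
text does not specify the overlap measure, so Jaccard overlap between
sets of category labels is assumed here. The `margin` that decides
what counts as a clear-cut match and all names are hypothetical.

```python
def overlap(a, b):
    """Jaccard overlap between two sets of category labels (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_protocol(protocol_classes, target_classes, margin=0.1):
    """Rank the targets by their overlap with the protocol.

    protocol_classes: categories assigned to the protocol (first step).
    target_classes: non-empty dict mapping each target to its categories
    (the classification repeated for every member of the target set).
    Returns (best_target, scores, needs_secondary_evaluation).
    """
    scores = {name: overlap(protocol_classes, classes)
              for name, classes in target_classes.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    clear_cut = len(ranked) == 1 or scores[best] - scores[ranked[1]] >= margin
    return best, scores, not clear_cut
```

When the third return value is true, no clear-cut match was found and
the case would go on to the secondary evaluation, which this sketch
does not attempt to model.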


Knowledge-elicitation methods. The elicitation of knowledge
needed to drive expert systems is a "bottleneck problem." This
was one of the reasons to stimulate the research in machine-learning
methods as a means of explicating knowledge. Very often the rather
unstructured interview approach is accompanied by so-called rapid
prototyping. This means that the system is implemented while the
knowledge base is essentially of low quality and incomplete. This
might result in poor final systems, like most rule-based systems to
date. If this is already the case for rather well-understood areas
of human expertise it seems unwise to use an unstructured
elicitation procedure for the expertise of free-response judging. In
more structured approaches emphasis is given to the necessity of a
well-specified framework for interpretation of the verbal material,
be it interviews with, or thinking-aloud protocols produced by, the
expert. In the present paper it is proposed to combine the structured
knowledge-elicitation procedure with the use of learning systems.


Proposed procedure. The proposed methodology differs from
accepted methodologies by using information already present in the
data base of classified cases. The elicitation procedure consists of
three major parts: (1) Learn, (2) Pathology detection, (3)
Confrontation.


In the first phase the expert judge will be interviewed on the
set of attributes that are used to describe a target picture. Also,
the primary set of classes is formulated. After that, a training set
of old cases is selected to be presented to a learning system. Each
case consists of a series of attribute values together with the
classification by the expert judge. After the training the systems are
able to classify other cases from the old data base and to compare
classifications of the target set with the classification of the
protocol. The trained system has become a (first-order) model of the
expert judge.


In the second phase the remainder of the old data base is
presented to the "trained" system for judging. If the judging by
the system differs from that made in the past by the human expert
we call this a "pathological case."


In the third phase the human expert is confronted with the
set of pathologies. The knowledge engineer might directly ask the
expert why he or she deviated from the model or give him or her
the cases to solve again while thinking aloud. Analysis of the
thinking-aloud protocol should occur in terms of deviations from the
model and thus produce additions to the knowledge base.
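

The first two phases lend themselves to a short sketch. The learner
below is a deliberately simple nearest-neighbour stand-in, chosen only
to keep the example self-contained. It is an assumption of this
sketch and not one of the learning systems discussed in this paper.
The third phase involves the human expert and is therefore not coded.

```python
def train_nearest_neighbour(training_cases):
    """Phase 1 stand-in learner: memorise cases, classify by closest match.

    Each case is (attribute_values, expert_class), where attribute_values
    is a tuple of the same length for all cases.
    """
    memory = list(training_cases)

    def classify(attributes):
        def distance(case):
            stored, _ = case
            # Number of attributes on which the two cases disagree.
            return sum(a != b for a, b in zip(stored, attributes))
        return min(memory, key=distance)[1]

    return classify


def find_pathological_cases(model, remaining_cases):
    """Phase 2: cases where the model and the expert's past judgment differ."""
    return [(attributes, expert_class, model(attributes))
            for attributes, expert_class in remaining_cases
            if model(attributes) != expert_class]
```

The list returned by `find_pathological_cases` is the material with
which the expert would be confronted in the third phase.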


The automated concept learner. Previous work that tried to
apply learning systems to the process of knowledge acquisition used
systems like Automated Concept Learning System (ACLS), which
construct a decision tree from examples. However, it was found that
although the resulting decision trees were able to classify new cases
properly, these trees, which represent the knowledge of the human
expert, very often were hardly recognized by the same expert.
This decision-tree representation therefore did not offer a fruitful
framework on which the knowledge engineer could base his or her
further interviews. This situation is not very different from a
representation by linear regression models, which have been shown to
have considerable predictive power. However, the linear regression
formula does not make a lot of sense to the human expert. Therefore, we
have proposed elsewhere not only to use an ACLS type of learning
system but also to use a learning system that is supposed to create
a psychologically valid representation of the human expert's
knowledge.


The prototype learner. The "prototype" model has been de- 
veloped by Rosch. In contrast with linear regression models, the 
"prototype" model allows for nonmonotonic relations between the 
values of the attributes and the class determination. So, apart 
from an implementation of a decision-tree building system à la ACLS,
a system has been implemented that is capable of learning
categories as proposed in the Rosch model. During the learn phase a
training set of old cases, consisting of the values of the attributes
and the resulting classification, is offered to the system. The
system learns which attributes contribute to which degree to the
final classification decision. After the learning phase new cases
can be offered to the system, which will calculate an overlap score
of the new instance with the "prototype" of a class. 
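

A minimal sketch of such a prototype learner follows. It assumes
numeric attribute values, takes the prototype of a class to be the
mean attribute vector of its training cases, and leaves the attribute
weights as a parameter instead of learning them. All of these are
simplifications made for illustration and not a description of the
implemented system.

```python
from collections import defaultdict


def learn_prototypes(training_cases):
    """Return {class_label: prototype}, the prototype being a mean vector.

    Each case is (attribute_values, class_label) with numeric attributes.
    """
    grouped = defaultdict(list)
    for attributes, label in training_cases:
        grouped[label].append(attributes)
    return {label: [sum(column) / len(column) for column in zip(*rows)]
            for label, rows in grouped.items()}


def overlap_score(attributes, prototype, weights=None):
    """Weighted similarity in (0, 1]; 1 means identical to the prototype."""
    weights = weights or [1.0] * len(prototype)
    distance = sum(w * abs(a - p)
                   for w, a, p in zip(weights, attributes, prototype))
    return 1.0 / (1.0 + distance)


def classify(attributes, prototypes, weights=None):
    """Pick the class whose prototype overlaps most with the new case."""
    return max(prototypes,
               key=lambda label: overlap_score(attributes,
                                               prototypes[label], weights))
```

Because the score peaks at the prototype value and falls off on either
side of it, the relation between an attribute value and class
membership need not be monotonic, which a linear regression formula
cannot express.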


Concluding remarks. Current work by the present author 
using a similar knowledge-elicitation approach in the domain of 
psychodiagnostics is promising. It appears that "intuitive" knowl- 
edge can be elicited with the proposed approach and implemented 
as a moderator of a primarily pattern-matching-based classification. 


NEW INTERPRETATIONS OF ESP LITERATURE* 


A CRITICAL REVIEW OF THE DISPLACEMENT EFFECT 


Julie Milton (Dept. of Psychology, University of Edinburgh, 
7 George Square, Edinburgh EH8 9JZ, Scotland)** 


The "displacement effect" in ESP research refers to a
situation in which the percipient, instead of describing the intended
target for a particular trial, describes some other experimental
material. Despite the fact that over 100 papers have dealt with some
aspect of the displacement effect since the effect caught the general
interest of parapsychologists in 1940, no exhaustive review of the
displacement literature has appeared. It was felt that such a
review would be timely for a number of reasons, partly because the
attitude of researchers these days to the apparent occurrence of
displacement is generally one of irritation, whereas earlier
researchers reacted with a more positive (and hence possibly more
productive) interest; partly because recently some researchers have
suggested that in the context of finding limits for psi, the
circumstances under which displacement occurs and the extent to which
displacement is a "deliberate" error or a genuine error on the part of
the percipient may have some theoretical importance. Another reason
for a review would be to examine the characteristics of displacement
as a phenomenon of interest in itself.


In the past, researchers have explored two main lines of
research with respect to displacement; the first has involved the
possibility of a relationship between scoring on targets of different
displacements, and the second, the possibility of a relationship
between displaced scoring and psychological and situational variables.


Concerning the possibility of a relationship between scoring
on targets of different displacements, there are a couple of
potentially important statistical artifacts that apply to forced-choice
studies which can give rise to the appearance of displacement 


*Chaired by Erlendur Haraldsson. 
**I am grateful to the Perrott-Warrick Studentship in Psychical
Research for financial support during the writing of this paper. 